AITopics | part-of-speech tagging

part-of-speech tagging

A Comparative Analysis of LLM Adaptation: SFT, LoRA, and ICL in Data-Scarce Scenarios

Bohnet, Bernd, Dangovski, Rumen, Swersky, Kevin, Moore, Sherry, Chaudhry, Arslan, Kenealy, Kathleen, Fiedel, Noah

arXiv.org Artificial Intelligence | Nov-5-2025

The remarkable capabilities of Large Language Models (LLMs) often need to be tailored for specific applications, requiring the integration of new knowledge or the acquisition of new skills. While full fine-tuning is a powerful adaptation method, it is computationally expensive and can lead to a degradation of general reasoning abilities, a phenomenon known as catastrophic forgetting. A range of alternative techniques exists, each with its own trade-offs. In-Context Learning (ICL) is fast but limited by context length, while Parameter-Efficient Fine-Tuning (PEFT) methods like Low-Rank Adaptation (LoRA) offer a middle ground by minimizing parameter changes. However, the challenge of catastrophic forgetting persists, raising questions about the best adaptation strategy for a given task. This paper presents a comparative analysis of Supervised Fine-Tuning (SFT), LoRA, and ICL in data-scarce scenarios. We find that LoRA provides the most effective balance, successfully instilling new skills with minimal impact on the base model's general knowledge. In contrast, while SFT excels at skill acquisition, it is highly susceptible to catastrophic forgetting. ICL is effective for incorporating factual knowledge but struggles with complex skills. Our findings offer a practical framework for selecting an LLM adaptation strategy. We highlight the critical distinction between skill acquisition and knowledge integration, and clarify the trade-offs between task-specific performance and the preservation of general capabilities.
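To make "minimizing parameter changes" concrete, here is a minimal pure-Python sketch of the low-rank update commonly used in LoRA: the frozen weight W is adapted as W + (alpha / r) * B A, where only the small matrices B and A are trained. The function names, matrix sizes, and values are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def lora_param_counts(d_out: int, d_in: int, rank: int) -> Tuple[int, int]:
    """Trainable parameters for full fine-tuning vs. a rank-`rank` LoRA adapter."""
    full = d_out * d_in            # every entry of W is updated
    lora = rank * (d_out + d_in)   # only B (d_out x r) and A (r x d_in) are updated
    return full, lora


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def lora_effective_weight(W: Matrix, B: Matrix, A: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * B @ A, leaving the frozen W itself untouched."""
    scale = alpha / len(A)  # r is the number of rows of A
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


# Toy example (hypothetical values): a 2x3 weight with a rank-1 adapter.
W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
A = [[0.5, 0.0, -0.5]]
B_init = [[0.0], [0.0]]      # B starts at zero, so the adapted model equals the base model
B_trained = [[1.0], [2.0]]   # stand-in for values learned during adaptation

print(lora_param_counts(4096, 4096, 8))
print(lora_effective_weight(W, B_init, A, alpha=2.0))
print(lora_effective_weight(W, B_trained, A, alpha=2.0))
```

Because the update lives in the small factors B and A, the base weights can be kept frozen and the adapter removed or swapped without retraining the underlying model.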

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2511.0013

Genre: Research Report > New Finding (0.66)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

ACO-tagger: A Novel Method for Part-of-Speech Tagging using Ant Colony Optimization

Mohammadi, Amirhossein, Hajiaghajani, Sara, Bahrani, Mohammad

arXiv.org Artificial Intelligence | Mar-27-2023

Swarm Intelligence algorithms have gained significant attention in recent years as a means of solving complex and non-deterministic problems. These algorithms are inspired by the collective behavior of natural creatures, and they simulate this behavior to develop intelligent agents for computational tasks. One such algorithm is Ant Colony Optimization (ACO), which is inspired by the foraging behavior of ants and their pheromone-laying mechanism. ACO is used for solving difficult problems that are discrete and combinatorial in nature. Part-of-Speech (POS) tagging is a fundamental task in natural language processing that aims to assign a part-of-speech role to each word in a sentence. In this research paper, we propose a high-performance POS-tagging method based on ACO, called ACO-tagger. This method achieved an accuracy of 96.867%, outperforming several state-of-the-art methods. The proposed method is fast and efficient, making it a viable option for practical applications.

evolutionary algorithm, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2303.1676

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.05)
North America > Mexico > Quintana Roo > Cancún (0.04)
Asia > Japan > Hokkaidō > Hokkaidō Prefecture > Sapporo (0.04)

Genre: Research Report > Promising Solution (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
(2 more...)

Reliable Part-of-Speech Tagging of Historical Corpora through Set-Valued Prediction

Heid, Stefan, Wever, Marcel, Hüllermeier, Eyke

arXiv.org Machine Learning | Oct-20-2020

Syntactic annotation of corpora in the form of part-of-speech (POS) tags is a key requirement for both linguistic research and subsequent automated natural language processing (NLP) tasks. This problem is commonly tackled using machine learning methods, i.e., by training a POS tagger on a sufficiently large corpus of labeled data. While the problem of POS tagging can essentially be considered solved for modern languages, historical corpora turn out to be much more difficult, especially due to the lack of native speakers and the sparsity of training data. Moreover, most texts have no sentences as we know them today, nor a common orthography. These irregularities render the task of automated POS tagging more difficult and error-prone. Under these circumstances, instead of forcing the POS tagger to predict and commit to a single tag, it should be enabled to express its uncertainty. In this paper, we consider POS tagging within the framework of set-valued prediction, which allows the POS tagger to express its uncertainty by predicting a set of candidate POS tags instead of guessing a single one. The goal is to guarantee a high confidence that the correct POS tag is included while keeping the number of candidates small. In our experimental study, we find that extending state-of-the-art POS taggers to set-valued prediction yields more precise and robust taggings, especially for unknown words, i.e., words not occurring in the training data.
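The idea of returning a small set of candidate tags can be illustrated with a simple heuristic: rank the tags by predicted probability and keep adding them until their combined probability reaches a chosen confidence level. This is only a sketch of set-valued prediction in general, not the paper's actual procedure; the function name, threshold, and probability values below are assumptions made for illustration.

```python
from typing import Dict, List


def candidate_tag_set(tag_probs: Dict[str, float], confidence: float = 0.9) -> List[str]:
    """Return the most probable tags whose cumulative probability reaches `confidence`."""
    if not 0.0 < confidence <= 1.0:
        raise ValueError("confidence must be in (0, 1]")
    # Most probable tags first, so the returned set stays as small as possible.
    ranked = sorted(tag_probs.items(), key=lambda item: item[1], reverse=True)
    chosen: List[str] = []
    mass = 0.0
    for tag, prob in ranked:
        chosen.append(tag)
        mass += prob
        if mass >= confidence:
            break
    return chosen


# Hypothetical per-word distributions from some probabilistic tagger.
known_word = {"NOUN": 0.97, "VERB": 0.02, "ADJ": 0.01}
unknown_word = {"NOUN": 0.50, "ADJ": 0.35, "VERB": 0.15}

print(candidate_tag_set(known_word))                     # a confident word keeps a single tag
print(candidate_tag_set(unknown_word, confidence=0.8))   # an uncertain word keeps several
```

A word the tagger is sure about gets a singleton set, while an ambiguous or unseen word gets several candidates, which is the trade-off between coverage and set size described above.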

prediction, set-valued prediction, tagger, (16 more...)

arXiv.org Machine Learning

2008.01377

Country:

Europe > United Kingdom > England > Greater Manchester > Manchester (0.04)
Europe > Portugal > Porto > Porto (0.04)
Europe > Germany (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Part-of-Speech Tagging

#artificialintelligence | Sep-8-2019, 16:34:29 GMT

Rule-Based: A dictionary is constructed with possible tags for each word. Rules are either hand-crafted, learned, or both. An example rule might say, "If an ambiguous/unknown word X is preceded by a determiner and followed by a noun, tag it as an adjective."
Statistical: A text corpus is used to derive useful probabilities. Given a sequence of words, the most probable sequence of tags is selected.
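For the statistical approach, one standard way to select the most probable tag sequence is the Viterbi algorithm over a hidden Markov model. The sketch below assumes a toy three-tag model; the start, transition, and emission probabilities are invented for illustration and are not derived from any corpus.

```python
from typing import Dict, List


def viterbi(words: List[str], tags: List[str],
            start_p: Dict[str, float],
            trans_p: Dict[str, Dict[str, float]],
            emit_p: Dict[str, Dict[str, float]],
            unseen: float = 1e-6) -> List[str]:
    """Return the most probable tag sequence for `words` under an HMM."""
    if not words:
        return []
    # best[i][t]: probability of the best tag path for words[:i + 1] that ends in tag t.
    best = [{t: start_p[t] * emit_p[t].get(words[0], unseen) for t in tags}]
    back: List[Dict[str, str]] = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            # Best previous tag to transition from when arriving at tag t.
            prev = max(tags, key=lambda p: best[i - 1][p] * trans_p[p][t])
            best[i][t] = (best[i - 1][prev] * trans_p[prev][t]
                          * emit_p[t].get(words[i], unseen))
            back[i][t] = prev
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy model with made-up probabilities (assumptions, not corpus estimates).
tags = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
trans_p = {
    "DET": {"DET": 0.01, "NOUN": 0.9, "VERB": 0.09},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.3, "VERB": 0.2},
}
emit_p = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "barks": 0.1},
    "VERB": {"barks": 0.4, "dog": 0.05},
}

print(viterbi(["the", "dog", "barks"], tags, start_p, trans_p, emit_p))
# → ['DET', 'NOUN', 'VERB']
```

In a real tagger the probabilities would be estimated from a tagged corpus, and log-probabilities would normally replace the raw products to avoid numerical underflow on long sentences.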

artificial intelligence, machine learning, stochastic method, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.43)

Part-of-Speech Tagging with PowerShell

@machinelearnbot | Dec-20-2017, 03:45:46 GMT

When analyzing text, a common goal is to identify the parts of speech within that text – which parts are nouns? To accomplish this goal, the area of natural language processing in Computer Science has developed systems for part-of-speech tagging, or "POS tagging". The default English model is 97% correct on known words, and 90% correct on unknown words. "SpeechTagger" is a PowerShell interface to this tagger. By default, Split-PartOfSpeech outputs objects that represent words and the part of speech associated with them. This is sometimes useful for regular expressions, or for adapting code you might have previously written to consume other part-of-speech taggers.

artificial intelligence, natural language, part-of-speech tagging, (2 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.69)

Global and Local Approach of Part-of-Speech Tagging for Large Corpora

Yu, Shi (University of Chicago) | Grossman, Robert (University of Chicago) | Rzhetsky, Andrey (University of Chicago)

AAAI Conferences | Nov-5-2012

We present Global-Local POS tagging, a framework to train generative stochastic Part-of-Speech models on large corpora. Global Taggers offer several advantages over their counterparts trained on small, curated corpora, including the ability to automatically extend and update their models to new text. Global Taggers also avoid a fundamental limitation of current models, whose performance heavily relies on curated text with manually assigned labels. We illustrate our approach by training several Global Taggers, implemented with generative stochastic models, on two large corpora using a high-performance computing architecture. We further demonstrate that Global Taggers can be improved by incorporating models trained on curated text, called Local Taggers, for better tagging performance on specific topics.

data mining, machine learning, tagger, (19 more...)

AAAI Conferences

2012 AAAI Fall Symposium Series

Country: North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
